32 research outputs found

    Rule Model: Multilevel and Multidimension Association Rule for Market Basket Analysis at PT. Maha Agung

    PT. Maha Agung is a distribution company with distribution warehouses spread across five different regions. Because of the size of its market area, the company needs information not only on the associations among its products but also on factors such as time, marketing region, customer profile, and so on. Consequently, the standard rule models produced by market basket analysis methods, namely single-level, multilevel, and multidimension association rules, cannot be used. This research proposes combining two kinds of rule model, the Multilevel Association Rule and the Multidimension Association Rule, into a new form. The application built in this research produces a new association rule model that we call the "Multilevel And Multidimension Association Rule". Using this new association rule model to meet the needs of PT. Maha Agung proved appropriate, as shown by the fairly good result of a questionnaire given to prospective users, namely 89.6%. Keywords: Data Mining, Market Basket Analysis, Multilevel And Multidimension Association Rule
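
    As a rough illustration of what a rule mixing both ideas might look like, the sketch below scores a rule of the form "region = East AND buys dairy => buys bakery", where the region is a dimension attribute and the item sides are generalised to category level. The data, the `CATEGORY` hierarchy, and the function names are all hypothetical; this is not the authors' model, only the standard support and confidence measures applied to such a rule.

```python
from typing import FrozenSet, List, NamedTuple, Tuple


class Transaction(NamedTuple):
    region: str            # dimension attribute (hypothetical)
    items: FrozenSet[str]  # products bought together


# Hypothetical two-level concept hierarchy: product -> category.
CATEGORY = {
    "whole milk": "dairy",
    "yoghurt": "dairy",
    "white bread": "bakery",
    "donut": "bakery",
}


def holds(tx: Transaction, region: str, categories: FrozenSet[str]) -> bool:
    """True if the transaction is in `region` and covers every category."""
    tx_categories = {CATEGORY[item] for item in tx.items}
    return tx.region == region and categories <= tx_categories


def rule_metrics(
    transactions: List[Transaction],
    region: str,
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> Tuple[float, float]:
    """Support and confidence of: region AND antecedent => consequent."""
    n_antecedent = sum(holds(tx, region, antecedent) for tx in transactions)
    n_both = sum(holds(tx, region, antecedent | consequent) for tx in transactions)
    support = n_both / len(transactions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


transactions = [
    Transaction("East", frozenset({"whole milk", "white bread"})),
    Transaction("East", frozenset({"yoghurt", "donut"})),
    Transaction("East", frozenset({"whole milk"})),
    Transaction("West", frozenset({"yoghurt", "white bread"})),
    Transaction("West", frozenset({"donut"})),
]

support, confidence = rule_metrics(
    transactions, "East", frozenset({"dairy"}), frozenset({"bakery"})
)
print(f"support={support:.2f} confidence={confidence:.2f}")
```

    Generalising items to their category before counting is what makes the rule "multilevel"; adding the region test is what makes it "multidimension".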

    Handwritten Javanese Character Recognition Using Several Artificial Neural Network Methods

    Javanese characters are traditional characters that are used to write the Javanese language. The Javanese language is a language used by many people on the island of Java, Indonesia. The use of Javanese characters is diminishing more and more because of the difficulty of studying the Javanese characters themselves. The Javanese character set consists of basic characters, numbers, complementary characters, and so on. In this research we have developed a system to recognize Javanese characters. Input for the system is a digital image containing several handwritten Javanese characters. Preprocessing and segmentation are performed on the input image to get each character. For each character, feature extraction is done using the ICZ-ZCZ method. The output from feature extraction will become input for an artificial neural network. We used several artificial neural networks, namely a bidirectional associative memory network, a counterpropagation network, an evolutionary network, a backpropagation network, and a backpropagation network combined with chi2. From the experimental results it can be seen that the combination of chi2 and backpropagation achieved better recognition accuracy than the other methods

    A Multi-type Classifier Ensemble for Detecting Fake Reviews Through Textual-based Feature Extraction

    The financial impact of online reviews has prompted some fraudulent sellers to generate fake consumer reviews for either promoting their products or discrediting competing products. In this study, we propose a novel ensemble model - the Multitype Classifier Ensemble (MtCE) - combined with a textual-based featuring method, which is relatively independent of the system, to detect fake online consumer reviews. Unlike other ensemble models that utilise only the same type of single classifier, our proposed ensemble utilises several customised machine learning classifiers (including deep learning models) as its base classifiers. The results of our experiments show that the MtCE can adequately detect fake reviews, and that it outperforms other single and ensemble methods in terms of accuracy and other measurements in all the relevant public datasets used in this study. Moreover, if set correctly, the parameters of MtCE, such as base-classifier types, the total number of base classifiers, bootstrap and the method to vote on output (e.g., majority or priority), further improve the performance of the proposed ensemble
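
    The majority-vote option mentioned above can be sketched as follows. The three base classifiers here are hypothetical hand-written heuristics standing in for trained models of different types; they are not the classifiers used in MtCE, and the priority-vote variant is not shown.

```python
from collections import Counter
from typing import Callable, Sequence

Classifier = Callable[[str], str]  # maps a review text to "fake" or "genuine"


# Hypothetical stand-ins for trained base classifiers of different types.
def many_exclamations(text: str) -> str:
    return "fake" if text.count("!") >= 3 else "genuine"


def very_short(text: str) -> str:
    return "fake" if len(text.split()) < 5 else "genuine"


def superlatives(text: str) -> str:
    words = set(text.lower().replace("!", " ").split())
    return "fake" if words & {"best", "amazing", "perfect"} else "genuine"


def majority_vote(classifiers: Sequence[Classifier], text: str) -> str:
    """Return the label predicted by the largest number of base classifiers."""
    votes = Counter(clf(text) for clf in classifiers)
    return votes.most_common(1)[0][0]


ensemble = [many_exclamations, very_short, superlatives]
for review in (
    "Best product ever!!!",
    "Amazing value, arrived on time and works as described.",
):
    print(review, "->", majority_vote(ensemble, review))
```

    An odd number of base classifiers avoids ties when there are only two labels.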

    Segmentation of Hanacaraka Characters using Double Projection Profile and Hough Transform

    Segmenting Hanacaraka characters, the ancient Javanese script used by one of Indonesia's ethnic groups on the island of Java, is difficult because the spacing between lines, the size of the characters, and the stroke thickness are all inconsistent. The inconsistent line spacing and letter size are caused by pair letters and by the final vowel and consonant letters in a phoneme, while the inconsistent thickness is due to the Hanacaraka writing style itself. Image preprocessing is needed to obtain input without skew. To correct skewed text documents, we used the Hough transform to predict the edges of the text area. After that, a horizontal projection profile is used to segment the lines, followed by a vertical projection profile to segment each character. This segmentation method gives good results for printed documents. Segmenting handwritten documents is harder because the rows in the document are uneven and very tightly spaced, which causes them to overlap. When a line is segmented wrongly, all the characters on that line are segmented wrongly as well. This problem can be eliminated using a connectivity test. Before this, the line needs to be segmented together with the overlap area. Character parts below or above the main character can then be eliminated because they are not connected to the main character
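
    The double projection profile step can be sketched in a few lines, assuming an already deskewed binary image in which 1 marks an ink pixel. The function names and the toy image are illustrative only; the Hough-based deskewing and the connectivity test described in the abstract are not shown.

```python
from typing import List, Tuple

Image = List[List[int]]  # 1 = ink pixel, 0 = background


def runs_of_ink(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges where the profile is non-zero."""
    runs = []
    start = None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Horizontal projection: count ink per row, then split at empty rows."""
    horizontal = [sum(row) for row in image]
    return runs_of_ink(horizontal)


def segment_characters(image: Image, top: int, bottom: int) -> List[Tuple[int, int]]:
    """Vertical projection inside one text line: split at empty columns."""
    width = len(image[0])
    vertical = [sum(image[r][c] for r in range(top, bottom)) for c in range(width)]
    return runs_of_ink(vertical)


page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
]
lines = segment_lines(page)
characters = [segment_characters(page, top, bottom) for top, bottom in lines]
print(lines)
print(characters)
```

    Because the split relies on completely empty rows and columns, it fails exactly where the abstract says it does: on tightly spaced handwriting whose rows overlap.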

    Comparison between Shape-Based and Area-Based Features Extraction for Java Character Recognition

    The Java language is one of the local languages widely used in Indonesia, chiefly by residents of the island of Java. It has its own script, called Java characters. In this research we compare feature extraction methods for recognising Java characters. Recognition accuracy is greatly affected by the accuracy of feature extraction, because when many features are similar between one character and another, the system may recognise them as the same character. We compare shape-based features and area-based features. Shape-based features consist of the curves, lines, and loops composing a Java character; the number of curves, lines, and loops varies from one character to another. For area-based feature extraction, each character is divided into 9 × 9 equal regions, and the number of pixels in each region is counted. The experiments use a probabilistic neural network (PNN) as the recognition method. In the experimental results, area-based feature extraction gives better results than shape-based feature extraction: with shape-based features the system achieves a recognition accuracy below 20%, whereas with area-based features it can achieve more than 60%
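
    A minimal sketch of the area-based features, assuming the regions form a 9 by 9 grid over a binary character image whose sides are multiples of 9. The function name and the toy glyph are illustrative; the authors' preprocessing and normalisation are not known from the abstract.

```python
from typing import List

Image = List[List[int]]  # 1 = ink pixel, 0 = background


def zone_pixel_counts(image: Image, grid: int = 9) -> List[int]:
    """Split the image into grid x grid equal zones and count ink pixels in each."""
    height, width = len(image), len(image[0])
    if height % grid or width % grid:
        raise ValueError("image sides must be multiples of the grid size")
    zone_h, zone_w = height // grid, width // grid
    counts = [0] * (grid * grid)
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            if pixel:
                counts[(r // zone_h) * grid + (c // zone_w)] += 1
    return counts


# Toy 18x18 glyph: a vertical stroke down column 0 plus one dot in the corner.
glyph = [[0] * 18 for _ in range(18)]
for r in range(18):
    glyph[r][0] = 1
glyph[17][17] = 1

features = zone_pixel_counts(glyph)
print(len(features), sum(features))
```

    The resulting 81-element vector is what would be passed to the classifier in place of the raw image.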

    DESIGN OF A DATA WAREHOUSE AND OLAP TOOLS AT COMPANY “X”

    Company “X” currently uses a manual system together with a simple database system to record the transactions carried out by the company. Purchase and sales transactions are still recorded manually and combined with a simple database application to store data and produce reports for the company's managers. The system currently in use does not support data analysis and decision making. In this research, decision-support software was built using a data warehouse and OLAP tools. The application covers the transformation process from the source database to a star-schema database, data analysis through tables and charts, the creation of virtual and physical cubes, and report generation. The application uses Borland Delphi 7 as the programming language and for interface design, and Microsoft SQL Server 2000 for data storage. The information produced by the system consists of measures of sales data, sales return data, purchase data, purchase return data, data on purchases cancelled by customers, and data on orders placed by customers. These data can be displayed from several different points of view. Viewing the data from several different angles can help managers analyse more aspects of the available data

    Multi-level particle swarm optimisation and its parallel version for parameter optimisation of ensemble models: a case of sentiment polarity prediction

    Ensemble learning is increasingly used in sentiment analysis. Determining the parameter settings of ensemble models, however, is not easy. Besides its own parameters, an ensemble model has base-predictors that have their individual parameters. Some ensemble models use a specific base-predictor and could be optimised using standard metaheuristics such as the Particle Swarm Optimisation (PSO) approach. Optimising ensemble models with multiple base-predictor candidates is more complicated and challenging, as there are multiple options to choose from. We therefore propose Multi-Level PSO (ML-PSO) and Parallel ML-PSO (PML-PSO) to optimise the parameters of ensemble models, especially those with multiple base-predictors, for sentiment analysis. The idea is to utilise multiple PSOs as particles of the main PSO. The main PSO optimises ensemble-model parameters and determines the best base-predictor, whereas PSOs within it optimise the corresponding base-predictor's parameters. Experimental results using Bagging Predictors as the underlying ensemble model show that ML-PSO can improve prediction accuracy, while PML-PSO is able to speed up the processing time and further improve the accuracy

    Resampling imbalanced data to detect fake reviews using machine learning classifiers and textual-based features

    Fraudulent online sellers often collude with reviewers to garner fake reviews for their products. This act undermines the trust of buyers in product reviews, and potentially reduces the effectiveness of online markets. Being able to accurately detect fake reviews is, therefore, critical. In this study, we investigate several preprocessing and textual-based featuring methods along with machine learning classifiers, including single and ensemble models, to build a fake review detection system. Given the nature of product review data, where the number of fake reviews is far less than that of genuine reviews, we look into the results of each class in detail in addition to the overall results. We recognise from our preliminary analysis that, owing to imbalanced data, there is a high imbalance between the accuracies for different classes (e.g., 1.3% for the fake review class and 99.7% for the genuine review class), despite the overall accuracy looking promising (around 89.7%). We propose two dynamic random sampling techniques that are possible for textual-based featuring methods to solve this class imbalance problem. Our results indicate that both sampling techniques can improve the accuracy of the fake review class—for balanced datasets, the accuracies can be improved to a maximum of 84.5% and 75.6% for random under and over-sampling, respectively. However, the accuracies for genuine reviews decrease to 75% and 58.8% for random under and over-sampling, respectively. We also discover that, for smaller datasets, the Adaptive Boosting ensemble model outperforms other single classifiers; whereas, for larger datasets, the performance improvement from ensemble models is insignificant compared to the best results obtained by single classifiers
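
    For orientation, plain random under-sampling and over-sampling can be sketched as below. This is the textbook version, not the two dynamic sampling techniques the authors propose, and the function names and toy reviews are hypothetical.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[str, str]  # (review text, label)


def group_by_label(data: Sequence[Sample]) -> Dict[str, List[Sample]]:
    groups: Dict[str, List[Sample]] = defaultdict(list)
    for sample in data:
        groups[sample[1]].append(sample)
    return groups


def random_undersample(data: Sequence[Sample], seed: int = 0) -> List[Sample]:
    """Randomly drop samples until every class matches the smallest class."""
    rng = random.Random(seed)
    groups = group_by_label(data)
    target = min(len(items) for items in groups.values())
    balanced: List[Sample] = []
    for items in groups.values():
        balanced.extend(rng.sample(items, target))
    rng.shuffle(balanced)
    return balanced


def random_oversample(data: Sequence[Sample], seed: int = 0) -> List[Sample]:
    """Randomly duplicate samples until every class matches the largest class."""
    rng = random.Random(seed)
    groups = group_by_label(data)
    target = max(len(items) for items in groups.values())
    balanced: List[Sample] = []
    for items in groups.values():
        balanced.extend(items)
        balanced.extend(rng.choices(items, k=target - len(items)))
    rng.shuffle(balanced)
    return balanced


reviews = [(f"genuine review {i}", "genuine") for i in range(8)]
reviews += [(f"fake review {i}", "fake") for i in range(2)]

under = Counter(label for _, label in random_undersample(reviews))
over = Counter(label for _, label in random_oversample(reviews))
print(under, over)
```

    Under-sampling discards genuine reviews and over-sampling repeats fake ones, which is consistent with the trade-off reported above: accuracy on the fake class rises while accuracy on the genuine class falls.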

    Combining Sentiment Lexicons and Content-Based Features for Depression Detection

    Numerous studies on mental depression have found that tweets posted by users with major depressive disorder could be utilized for depression detection. The potential of sentiment analysis for detecting depression through an analysis of social media messages has brought increasing attention to this field. In this article, we propose 90 unique features as input to a machine learning classifier framework for detecting depression using social media texts. Derived from a combination of feature extraction approaches using sentiment lexicons and textual contents, these features are able to provide impressive results in terms of depression detection. While the performance of different feature groups varied, the combination of all features resulted in accuracies greater than 96% for all standard single classifiers, and the best accuracy of over 98% with Gradient Boosting, an ensemble classifier

    Recognition of Hanacaraka Characters in Old Manuscripts Using Feed-Forward Networks and Elman Recurrent Networks

    The Javanese language has a unique set of letters called Hanacaraka characters, which differ from the Latin alphabet. Since modern ethnic Javanese in Indonesia no longer use it for formal conversation and education, this language, and especially its Hanacaraka characters, is beginning to die out. To help preserve old manuscripts written in Hanacaraka characters, we created a system that can automatically recognise Javanese characters from an old manuscript or writing. For this system, we investigated and employed several methods of image processing, feature extraction, and machine learning for the character recogniser. In this paper, we present the results of our investigation of traditional feed-forward neural networks and Elman recurrent networks and compare their accuracies to obtain the best recogniser. We also compare the results with the accuracies of the probabilistic neural network and induction tree from our previous experiments. From the comparison, we found that the Elman recurrent network outperforms the other algorithms, with an accuracy of more than 97% on the training data and 85% on the test data